56 research outputs found

Learning the Joint Representation of Heterogeneous Temporal Events for Clinical Endpoint Prediction

Author: Liu Luchen
Shen Jianhao
Tang Jian
Wang Zichang
Zhang Ming
Publication date: 25/04/2018

The availability of a large amount of electronic health records (EHR) provides huge opportunities to improve health care services by mining these data. One important application is clinical endpoint prediction, which aims to predict whether a disease, a symptom, or an abnormal lab test will occur in the future based on patients' history records. This paper develops deep learning techniques for clinical endpoint prediction, which are effective in many practical applications. However, the problem is very challenging because patients' history records contain multiple heterogeneous temporal events such as lab tests, diagnoses, and drug administrations. The visiting patterns of different types of events vary significantly, and there exist complex nonlinear relationships between different events. In this paper, we propose a novel model for learning the joint representation of heterogeneous temporal events. The model adds a new gate to control the visiting rates of different events, which effectively models the irregular patterns of different events and their nonlinear correlations. Experimental results on real-world clinical data for the tasks of predicting death and abnormal lab tests demonstrate the effectiveness of the proposed approach over competitive baselines.

Comment: 8 pages, this paper has been accepted by AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Joint Language Semantic and Structure Embedding for Knowledge Graph Completion

Author: Gong Linyuan
Shen Jianhao
Song Dawn
Wang Chenguang
Publication date: 18/09/2022

The task of completing knowledge triplets has broad downstream applications. Both structural and semantic information play an important role in knowledge graph completion. Unlike previous approaches that rely on either the structures or the semantics of knowledge graphs, we propose to jointly embed the semantics in the natural language description of the knowledge triplets with their structure information. Our method embeds knowledge graphs for the completion task by fine-tuning pre-trained language models with respect to a probabilistic structured loss, where the forward pass of the language models captures semantics and the loss reconstructs structures. Our extensive experiments on a variety of knowledge graph benchmarks demonstrate the state-of-the-art performance of our method. We also show that our method can significantly improve performance in a low-resource regime, thanks to better use of semantics. The code and datasets are available at https://github.com/pkusjh/LASS.

Comment: COLING 202

arXiv.org e-Print Archive

FIMO: A Challenge Formal Dataset for Automated Theorem Proving

Author: Ju Wei
Li Lin
Liu Chengwu
Liu Qun
Liu Zhengying
Shen Jianhao
Wang Haiming
Xin Huajian
Yin Yichun
Yuan Ye
Zhang Ming
Zheng Chuanyang
Publication date: 08/09/2023

We present FIMO, an innovative dataset comprising formal mathematical problem statements sourced from the International Mathematical Olympiad (IMO) Shortlisted Problems. Designed to facilitate advanced automated theorem proving at the IMO level, FIMO is currently tailored for the Lean formal language. It comprises 149 formal problem statements, accompanied by both informal problem descriptions and their corresponding LaTeX-based informal proofs. Our initial experiments with GPT-4 underscore the limitations of current methodologies, indicating that there is still a long way to go before IMO-level automated theorem proving reaches satisfactory results.

arXiv.org e-Print Archive

TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models

Author: Cao Qingxing
Guo Zhijiang
Huang Yinya
Li Lin
Liang Xiaodan
Liu Qun
Liu Zhengying
Shen Jianhao
Wang Haiming
Xiong Jing
Yin Yichun
Yuan Ye
Zhang Ming
Zheng Chuanyang
Publication date: 24/10/2023

Automated theorem proving (ATP) has become an appealing domain for exploring the reasoning ability of recent successful generative language models. However, current ATP benchmarks mainly focus on symbolic inference and rarely involve the understanding of complex number combination reasoning. In this work, we propose TRIGO, an ATP benchmark that not only requires a model to reduce a trigonometric expression with step-by-step proofs but also evaluates a generative LM's reasoning ability on formulas and its capability to manipulate, group, and factor number terms. We gather trigonometric expressions and their reduced forms from the web, annotate the simplification process manually, and translate it into the Lean formal language system. We then automatically generate additional examples from the annotated samples to expand the dataset. Furthermore, we develop an automatic generator based on Lean-Gym to create dataset splits of varying difficulties and distributions in order to thoroughly analyze the model's generalization ability. Our extensive experiments show that TRIGO poses a new challenge for advanced generative LMs, including GPT-4, which is pre-trained on a considerable amount of open-source formal theorem-proving language data, and provides a new tool to study generative LMs' ability in both formal and mathematical reasoning.

Comment: Accepted by EMNLP 2023. Code is available at https://github.com/menik1126/TRIG

arXiv.org e-Print Archive

A Comprehensive Survey on Deep Graph Representation Learning

Author: Fang Zheng
Gu Yiyang
Ju Wei
Liu Zequn
Long Qingqing
Luo Xiao
Qiao Ziyue
Qin Yifang
Shen Jianhao
Sun Fang
Xiao Zhiping
Yang Junwei
Yuan Jingyang
Zhang Ming
Zhao Yusheng
Publication date: 11/04/2023

Graph representation learning aims to effectively encode high-dimensional sparse graph-structured data into low-dimensional dense vectors, a fundamental task that has been widely studied in a range of fields, including machine learning and data mining. Classic graph embedding methods follow the basic idea that the embedding vectors of interconnected nodes in the graph should remain relatively close to each other, thereby preserving the structural information between the nodes in the graph. However, this is sub-optimal because: (i) traditional methods have limited model capacity, which limits the learning performance; (ii) existing techniques typically rely on unsupervised learning strategies and fail to couple with the latest learning paradigms; (iii) representation learning and downstream tasks depend on each other and should be jointly enhanced. With the remarkable success of deep learning, deep graph representation learning has shown great potential and advantages over shallow (traditional) methods, and a large number of deep graph representation learning techniques, especially graph neural networks, have been proposed in the past decade. In this survey, we comprehensively review current deep graph representation learning algorithms by proposing a new taxonomy of existing state-of-the-art literature. Specifically, we systematically summarize the essential components of graph representation learning and categorize existing approaches by graph neural network architecture and by the most recent advanced learning paradigms. Moreover, this survey also presents practical and promising applications of deep graph representation learning. Last but not least, we state new perspectives and suggest challenging directions that deserve further investigation in the future.

arXiv.org e-Print Archive

Semaphorin 3A Contributes to Secondary Blood–Brain Barrier Damage After Traumatic Brain Injury

Semaphorin 3A (SEMA3A) is a member of the semaphorin family, a class of membrane-associated proteins that participate in the construction of nerve networks. SEMA3A has previously been reported to affect vascular permeability, but its influence in traumatic brain injury (TBI) is still unknown. To investigate the effects of SEMA3A, we used a mouse TBI model with a controlled cortical impact (CCI) device and an in vitro blood–brain barrier (BBB) injury model with oxygen-glucose deprivation (OGD). We tested post-TBI changes in the expression and distribution of SEMA3A and its related receptors (Nrp-1 and plexin-A1) through western blotting and double-immunofluorescence staining, respectively. Neurological outcomes were evaluated by modified neurological severity scores (mNSSs) and the beam-walking test. We examined BBB damage through Evans Blue dye extravasation, brain water content, and western blotting for VE-cadherin and p-VE-cadherin in vivo, and we examined the endothelial cell barrier through hopping probe ion conductance microscopy (HPICM), transwell leakage, and western blotting for VE-cadherin and p-VE-cadherin in vitro. Changes in miR-30b-5p were assessed by RT-PCR. Finally, the neuroprotective function of miR-30b-5p was measured by brain water content, mNSSs, and the beam-walking test. SEMA3A expression varied following TBI and peaked on the third day, with an approximately fourfold increase compared with the sham group and with the protein concentrated at the lesion boundary. SEMA3A contributed to neurological function deficits and secondary BBB damage in vivo. Our results demonstrated that the SEMA3A level following OGD injury was almost double that of the control group, and that the negative effects of OGD injury can be mitigated by blocking SEMA3A expression. Furthermore, the expression of miR-30b-5p decreased by approximately 40% on the third day and 60% on the seventh day post-CCI. OGD injury also decreased miR-30b-5p expression by approximately 50%. Additionally, the expression of SEMA3A post-TBI is regulated by miR-30b-5p, and miR-30b-5p could improve neurological outcomes post-TBI efficiently. Our results demonstrate that SEMA3A is a significant factor in secondary BBB damage after TBI and can be abolished by miR-30b-5p, which represents a potential therapeutic target.

Directory of Open Access Journals

Concatenated Deep-Learning Framework for Multitask Change Detection of Optical and SAR Images

Author: Huanfeng Shen
Jianhao Miao
Liangpei Zhang
Xinghua Li
Yanyuan Huang
Zhengshun Du
Publication venue: IEEE
Publication date: 01/01/2024

Optical and synthetic aperture radar (SAR) images provide complementary information. However, the heterogeneity of the same ground objects across the two image types makes change detection (CD) difficult. In response, transformation-based methods have been developed that treat image translation and CD as two independent tasks. Most methods use deep learning only for image translation, and the simple clustering and threshold segmentation that follow lead to poor CD results. Recently, a deep translation-based CD network (DTCDN) was proposed that applies deep learning to both image translation and CD to improve the results. However, DTCDN requires sequential training of the two independent subnetwork structures at a high computational cost. To address this, a concatenated deep-learning framework for optical and SAR images, the multitask change detection network (MTCDN), is proposed by integrating the CD network into a complete generative adversarial network. This framework contains two generators and two discriminators for the optical and SAR image domains. Multitask refers to the combination of image identification by the discriminators and CD based on an improved UNet++. The generators are responsible for image translation to unify the two images into the same feature domain. In the training and prediction stages, an end-to-end framework is realized to reduce cost. The experimental results on four optical and SAR datasets demonstrate the effectiveness and robustness of the proposed framework over eight baselines.

Directory of Open Access Journals